PSY 5939: Longitudinal Data Analysis

1 Interpreting fixed and random effects

1.1 Review

1.1.1 Linear mixed model

\[Y = \mathbf{X} \beta + \mathbf{Z} \gamma + \epsilon\]

\(\mathbf{X} \beta\) are the fixed effects

\(\mathbf{Z} \gamma\) are the random effects

\(\epsilon\) is the residual

1.1.2 aka multilevel model

Linear change over time, random intercept, random slope

Level 1 = measurement occasion: Model of observations over time

\[Y_{ij} = \pi_{0i} + \pi_{1i} (Time_{ij}) + e_{ij}\]

Level 2 = participant: Model of participant level differences

\[\pi_{0i} = \beta_{00} + r_{0i}\] \[\pi_{1i} = \beta_{10} + r_{1i}\]

Combined equation

\[Y_{ij} = \beta_{00} + \beta_{10} (Time_{ij}) + r_{0i} + r_{1i} (Time_{ij}) + e_{ij}\]
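The combined equation can be read as a data-generating recipe: each person gets their own intercept and slope deviation (\(r_{0i}\), \(r_{1i}\)), plus occasion-level noise (\(e_{ij}\)). A minimal simulation sketch (in Python, using hypothetical parameter values, not the estimates from this dataset):

```python
import random

random.seed(1)

beta00, beta10 = 100.0, 5.0          # hypothetical fixed intercept and slope
sd_r0, sd_r1, sd_e = 2.0, 3.0, 5.0   # hypothetical random-effect and residual SDs

data = []
for i in range(50):                  # 50 subjects
    r0 = random.gauss(0, sd_r0)      # person-specific intercept deviation
    r1 = random.gauss(0, sd_r1)      # person-specific slope deviation
    for t in range(4):               # 4 measurement occasions
        e = random.gauss(0, sd_e)    # occasion-level residual
        y = beta00 + beta10 * t + r0 + r1 * t + e
        data.append((i + 1, t, y))

print(len(data))  # 200 rows (50 subjects x 4 occasions)
```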

1.1.3 Model results - R

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: Y ~ 1 + time1 + (1 + time1 | subject)
##    Data: tall_data
## 
##      AIC      BIC   logLik deviance df.resid 
##   1345.8   1365.6   -666.9   1333.8      194 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.37379 -0.55722  0.01556  0.49778  2.58483 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  subject  (Intercept)  4.94    2.223         
##           time1       10.50    3.240    -0.15
##  Residual             27.50    5.244         
## Number of obs: 200, groups:  subject, 50
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 100.5470     0.6955 144.561
## time1         5.3124     0.5657   9.391
## 
## Correlation of Fixed Effects:
##       (Intr)
## time1 -0.476

1.1.4 Model results - SPSS

1.1.5 Individual (linear) trajectories

Spaghetti plot

1.2 Fixed effects

1.2.1 Fixed effects

Using functions built into lme4

fixef(model1)
## (Intercept)       time1 
##  100.546956    5.312409

Using the broom package

tidy(model1, effects = "fixed", conf.int = TRUE)
## # A tibble: 2 × 7
##   effect term        estimate std.error statistic conf.low conf.high
##   <chr>  <chr>          <dbl>     <dbl>     <dbl>    <dbl>     <dbl>
## 1 fixed  (Intercept)   101.       0.696    145.      99.2     102.  
## 2 fixed  time1           5.31     0.566      9.39     4.20      6.42

1.2.2 Spaghetti plot with fixed effects

Spaghetti plot with fixed effects

1.3 Random effects

1.3.1 Random effects

Using functions built into lme4

print(VarCorr(model1), comp = c("Variance", "Std.Dev."))
##  Groups   Name        Variance Std.Dev. Corr  
##  subject  (Intercept)  4.9395  2.2225         
##           time1       10.4992  3.2402   -0.155
##  Residual             27.4984  5.2439

Using the broom package

tidy(model1, effects = "ran_pars", scales = "vcov")
## # A tibble: 4 × 4
##   effect   group    term                   estimate
##   <chr>    <chr>    <chr>                     <dbl>
## 1 ran_pars subject  var__(Intercept)           4.94
## 2 ran_pars subject  cov__(Intercept).time1    -1.12
## 3 ran_pars subject  var__time1                10.5 
## 4 ran_pars Residual var__Observation          27.5

1.3.2 Covariance as correlation

Covariance between intercept and slope as a correlation

\[\frac{\sigma_{r_{0i}r_{1i}}}{\sqrt{\sigma_{r_{0i}}^2} \sqrt{\sigma_{r_{1i}}^2}}\]

Also remember that a standard deviation is the square root of a variance
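Plugging in the estimates from the output above (variances 4.9395 and 10.4992, covariance \(-1.12\) from the broom output), this is simple arithmetic, sketched here in Python:

```python
import math

var_int, var_slope = 4.9395, 10.4992  # random intercept and slope variances
cov_is = -1.12                        # intercept-slope covariance (broom output)

corr = cov_is / (math.sqrt(var_int) * math.sqrt(var_slope))
print(corr)  # approximately -0.155, matching the Corr column in the VarCorr output
```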

1.3.3 Prediction interval for individuals

Interval for likely values of individual intercepts and slopes

95% of individual intercepts are in

95% of individual slopes are in

1.3.4 Prediction interval for individuals

[\(\beta_{00} - 1.96 \times \sqrt{\sigma_{r_{0i}}^2}\), \(\beta_{00} + 1.96 \times \sqrt{\sigma_{r_{0i}}^2}\)]

[\(100.547 - 1.96 \times \sqrt{4.94}\), \(100.547 + 1.96 \times \sqrt{4.94}\)]

95% of individual intercepts are in [\(96.191\), \(104.903\)]

[\(\beta_{10} - 1.96 \times \sqrt{\sigma_{r_{1i}}^2}\), \(\beta_{10} + 1.96 \times \sqrt{\sigma_{r_{1i}}^2}\)]

[\(5.312 - 1.96 \times \sqrt{10.499}\), \(5.312 + 1.96 \times \sqrt{10.499}\)]

95% of individual slopes are in [\(-1.038\), \(11.663\)]
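These intervals are quick arithmetic on the fixed effects and random-effect variances; a check in Python with the rounded estimates from the slides:

```python
import math

beta00, var_int = 100.547, 4.94      # fixed intercept, random intercept variance
beta10, var_slope = 5.312, 10.499    # fixed slope, random slope variance

int_lo = beta00 - 1.96 * math.sqrt(var_int)
int_hi = beta00 + 1.96 * math.sqrt(var_int)
slope_lo = beta10 - 1.96 * math.sqrt(var_slope)
slope_hi = beta10 + 1.96 * math.sqrt(var_slope)

print(round(int_lo, 3), round(int_hi, 3))      # 96.191 104.903
print(round(slope_lo, 3), round(slope_hi, 3))  # -1.039 11.663 (slide shows -1.038 from unrounded estimates)
```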

1.3.5 Spaghetti plot with fixed and random intercepts

95% of individual intercepts range from \(96.191\) to \(104.903\)

Spaghetti plot with fixed and random intercepts

1.3.6 Spaghetti plot with fixed and random slopes

95% of individual slopes range from \(-1.038\) to \(11.663\)

Spaghetti plot with fixed and random slopes

2 ICC and model comparison

2.1 Intra-class correlation (ICC)

2.1.1 Quantifying similarity

Previously:

Non-independence due to repeatedly measuring the same individual

The intraclass correlation (ICC) quantifies how much more alike observations from the same person are

2.1.2 Intraclass correlation

2.1.3 Unconditional model

Also called “random effects ANOVA”

Level 1 model: \[Y_{ij} = \pi_{0i} + e_{ij}\]

Level 2 model: \[\pi_{0i} = \beta_{00} + r_{0i}\]

Combined model: \[Y_{ij} = \beta_{00} + r_{0i} + e_{ij}\]

Overall mean (\(\beta_{00}\)) + variability between people (var(\(r_{0i}\))) + variability in observations over time from the same person (var(\(e_{ij}\)))

2.1.4 Calculating ICC

Use the level 2 intercept variance: \(\sigma_{r_{0i}}^2\)

And the level 1 residual variance: \(\sigma_{e}^2\)

\[ICC = \frac{\sigma_{r_{0i}}^2}{\sigma_{r_{0i}}^2 + \sigma_{e}^2} \]

\(ICC\) is the proportion of variance that is due to differences between people

\(1 - ICC\) is the proportion of variance due to errors in predicting individuals over time

2.1.5 Unconditional model

uncond <- lmer(Y ~ 1 + (1|subject), tall_data, REML = FALSE)
summary(uncond)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: Y ~ 1 + (1 | subject)
##    Data: tall_data
## 
##      AIC      BIC   logLik deviance df.resid 
##   1494.6   1504.5   -744.3   1488.6      197 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.2096 -0.6350 -0.1259  0.6475  2.5714 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  subject  (Intercept)  9.082   3.014   
##  Residual             92.033   9.593   
## Number of obs: 200, groups:  subject, 50
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 108.5156     0.8011   135.5

2.1.6 Calculating ICC

\[ICC = \frac{\sigma_{r_{0i}}^2}{\sigma_{r_{0i}}^2 + \sigma_{e}^2} = \frac{9.0818}{9.0818 + 92.0331} = \frac{9.0818}{101.1149} = 0.0898\]

9% of the variance in the outcome is between people

91% of the variance is within people
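The ICC calculation is just the ratio of variances from the unconditional model output, sketched here in Python:

```python
var_between = 9.0818  # level 2 intercept variance (subject)
var_within = 92.0331  # level 1 residual variance

icc = var_between / (var_between + var_within)
print(round(icc, 4))  # 0.0898
```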

2.1.7 Design effect

The design effect indicates how much standard errors would be underestimated if the clustering were ignored

\[D_{eff} = 1 + (n - 1) * ICC\]

where \(n\) is the cluster size

In this example, \(D_{eff} = 1 + (n - 1) * ICC = 1 + (4 - 1) * 0.0898 = 1.269\)
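A quick check of the design effect arithmetic for this example (Python sketch):

```python
icc = 0.0898  # ICC from the unconditional model
n = 4         # measurement occasions per subject (cluster size)

deff = 1 + (n - 1) * icc
print(round(deff, 4))  # 1.2694
```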

2.2 Likelihood ratio (LR) test

2.2.1 Deviance

Mixed models are estimated with maximum likelihood (ML)

Unlike linear regression, no sums of squares (SS)

Maximum likelihood instead provides deviance (also called -2 log likelihood)

2.2.2 Likelihood ratio (LR) test

Deviance = -2 \(\times\) log-likelihood

Difference in deviance between two models is \(\sim \chi^2\) with degrees of freedom equal to the difference in number of parameters estimated

2.2.3 Unconditional versus linear time model

Linear model parameters = 6: \(\beta_{00}\), \(\beta_{10}\), \(\sigma_{r_{0i}}^2\), \(\sigma_{r_{1i}}^2\), \(\sigma_{r_{0i}r_{1i}}\), \(\sigma_e^2\)

Unconditional model parameters = 3: \(\beta_{00}\), \(\sigma_{r_{0i}}^2\), \(\sigma_e^2\)

Model                  # parameters  Deviance
Linear time                 6        1333.8436
Unconditional               3        1488.6397
(Absolute) difference       3         154.7962

Critical value for \(\chi^2(3) = 7.815\)

The more complex model (linear time) is significantly better than the simpler (unconditional) model
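The LR test itself is a difference in deviances compared against a chi-square critical value (7.815 for df = 3, \(\alpha = .05\), from a chi-square table); sketched in Python:

```python
dev_uncond, p_uncond = 1488.6397, 3  # deviance, number of parameters
dev_linear, p_linear = 1333.8436, 6

lr_stat = dev_uncond - dev_linear    # distributed chi-square under the null
df = p_linear - p_uncond
critical = 7.815                     # chi-square critical value, df = 3, alpha = .05

print(round(lr_stat, 4), df, lr_stat > critical)  # 154.7961 3 True
```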

2.3 Variance explained

2.3.1 \(R^2\) analogues

Mixed models produce deviance

Pseudo-\(R^2\) values for mixed models

2.3.2 Variance explained and variance reduction

When you compare your model to the unconditional model

When you compare your model to some other (simpler) model

2.3.3 Variance explained and variance reduction

Model 1 is simpler, Model 2 is more complex

Reduction in variance =

\[\frac{\sigma_{e}^2(Model1) - \sigma_{e}^2(Model2)}{\sigma_{e}^2(Model1)}\]

2.3.4 Variance explained example

Unconditional model: \(\sigma_e^2 =\) 92.0331

Linear model with time centered at first wave: \(\sigma_e^2 =\) 27.4984

Variance explained = \(\frac{\sigma_{e}^2(uncond) - \sigma_{e}^2(linear)}{\sigma_{e}^2(uncond)} = (92.0331 - 27.4984) / (92.0331) = 0.7012\)

Interpret similarly to \(R^2\)
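The proportional reduction in residual variance is a one-line calculation (Python sketch):

```python
var_uncond = 92.0331  # level 1 residual variance, unconditional model
var_linear = 27.4984  # level 1 residual variance, linear time model

pseudo_r2 = (var_uncond - var_linear) / var_uncond
print(round(pseudo_r2, 4))  # 0.7012
```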

3 Adding predictors

3.1 Predictors of growth

3.1.1 Growth parameters

The model so far…

Describe growth (and individual variability in growth)

3.1.2 Predictors of growth

Mixed models can have predictors at different levels

Level 2 predictors are easy, but level 1 predictors are a bit more complex

3.2 Level 2 predictors

3.2.1 Level 2 predictors

Time-invariant predictors don’t change with time

3.2.2 Level 2 predictors

Level 2 predictors go in the level 2 equation(s)

\[\pi_{0i} = \beta_{00} + \beta_{01} (L2PRED) + r_{0i}\]

\[\pi_{1i} = \beta_{10} + \beta_{11} (L2PRED) + r_{1i}\]

\(\beta_{01}\) = the effect of L2PRED on average intercept value

\(\beta_{11}\) = the effect of L2PRED on average slope value

You can add a predictor of the intercept, the slope, or both

3.2.3 Level 2 predictors

Level 1: \[Y_{ij} = \pi_{0i} + \pi_{1i} (time_{ij}) + e_{ij}\]

Level 2: \[\pi_{0i} = \beta_{00} + \beta_{01} (L2PRED) + r_{0i}\] \[\pi_{1i} = \beta_{10} + \beta_{11} (L2PRED) + r_{1i}\]

Combined model: \[Y_{ij} = \beta_{00} + \beta_{01} (L2PRED) + \beta_{10} (time_{ij}) + \underline{\beta_{11} (L2PRED) (time_{ij})} + r_{0i} + r_{1i} (time_{ij}) + e_{ij}\]

3.2.4 Centering level 2 predictors

You don’t need to center level 2 predictors if they are already coded so that 0 is a meaningful value (e.g., dummy coded with 0 = control, 1 = treatment)

If your level 2 predictor is coded another way, center it

3.2.5 Interpreting fixed effects

I’m going to work with a model with only a random intercept here.

Combined model:

\[Y_{ij} = \beta_{00} + \beta_{01} (L2PRED) + \beta_{10} (time_{ij}) + \beta_{11} (L2PRED) (time_{ij}) + r_{0i} + e_{ij}\]

Just the fixed effects:

\[Y_{ij} = \beta_{00} + \beta_{01} (L2PRED) + \beta_{10} (time_{ij}) + \beta_{11} (L2PRED) (time_{ij})\]

This should look familiar

3.2.6 Example level 2 predictor

Remember that there is a level 2 variable in the dataset: group

head(tall_data, n = 12)
## # A tibble: 12 × 5
##    subject group  time     Y time1
##      <dbl> <dbl> <dbl> <dbl> <dbl>
##  1       1     1     0 101.      0
##  2       1     1     1 113.      1
##  3       1     1     2 124.      2
##  4       1     1     3 128.      3
##  5       2     0     0  95.3     0
##  6       2     0     1 106.      1
##  7       2     0     2 111.      2
##  8       2     0     3 107.      3
##  9       3     1     0  96.4     0
## 10       3     1     1  99.1     1
## 11       3     1     2 103.      2
## 12       3     1     3 110.      3

Combined model:

\[Y_{ij} = \beta_{00} + \beta_{01} (group) + \beta_{10} (time_{ij}) + \beta_{11} (group) (time_{ij}) + r_{0i} + e_{ij}\]

3.2.7 Model with level 2 predictor

model2 <- lmer(Y ~ 1 + group + time1 + time1*group + (1|subject), tall_data, REML = FALSE)
summary(model2)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: Y ~ 1 + group + time1 + time1 * group + (1 | subject)
##    Data: tall_data
## 
##      AIC      BIC   logLik deviance df.resid 
##   1295.2   1315.0   -641.6   1283.2      194 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.20536 -0.58245  0.01897  0.55369  2.42595 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  subject  (Intercept)  8.874   2.979   
##  Residual             29.368   5.419   
## Number of obs: 200, groups:  subject, 50
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 101.2629     1.1839  85.537
## group        -1.2344     1.5545  -0.794
## time1         1.7138     0.5289   3.241
## group:time1   6.2045     0.6944   8.935
## 
## Correlation of Fixed Effects:
##             (Intr) group  time1 
## group       -0.762              
## time1       -0.670  0.510       
## group:time1  0.510 -0.670 -0.762

3.2.8 Interpreting fixed effects

\[Y_{ij} = \beta_{00} + \beta_{01} (group) + \beta_{10} (time_{ij}) + \beta_{11} (group) (time_{ij}) =\]

\[101.263 + -1.234 (group) + 1.714 (time_{ij}) + 6.205 (group)(time_{ij})\]

Interpretation of the fixed effects plays out the same as a two-predictor regression with continuous (time) and categorical (group) predictors and an interaction

\(\beta_{00}\): Mean \(Y\) value for group = 0 when time1 = 0

\(\beta_{01}\): Difference between groups when time1 = 0

3.2.9 Interpreting fixed effects

\[Y_{ij} = \beta_{00} + \beta_{01} (group) + \beta_{10} (time_{ij}) + \beta_{11} (group) (time_{ij}) =\]

\[101.263 + -1.234 (group) + 1.714 (time_{ij}) + 6.205 (group)(time_{ij})\]

Interpretation of the fixed effects plays out the same as a two-predictor regression with continuous (time) and categorical (group) predictors and an interaction

\(\beta_{10}\): Linear change over time for group = 0

\(\beta_{11}\): Difference between groups in linear change over time

3.2.10 Simple slopes

When group = 0 (control):

\[101.263 + -1.234 (group) + 1.714 (time_{ij}) + 6.205 (group)(time_{ij})\]

\[101.263 + -1.234 (0) + 1.714 (time_{ij}) + 6.205 (0)(time_{ij})\]

\[101.263 + 1.714 (time_{ij})\]

When group = 1 (treatment):

\[101.263 + -1.234 (group) + 1.714 (time_{ij}) + 6.205 (group)(time_{ij})\]

\[101.263 + -1.234 (1) + 1.714 (time_{ij}) + 6.205 (1)(time_{ij})\]

\[101.263 + -1.234 + 1.714 (time_{ij}) + 6.205(time_{ij})\]

\[100.029 + 7.919(time_{ij})\]
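The simple slopes are just the fixed effects with group plugged in; a check of the substitution arithmetic (Python sketch):

```python
b00, b01, b10, b11 = 101.263, -1.234, 1.714, 6.205  # fixed effects from model2

for group in (0, 1):
    intercept = b00 + b01 * group
    slope = b10 + b11 * group
    print(f"group {group}: Y = {intercept:.3f} + {slope:.3f} * time")
# group 0: Y = 101.263 + 1.714 * time
# group 1: Y = 100.029 + 7.919 * time
```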

3.2.11 Spaghetti plot with fixed effects

Spaghetti plot with fixed effects by group

3.2.12 Random effects

print(VarCorr(model2), comp = c("Variance", "Std.Dev"))
##  Groups   Name        Variance Std.Dev.
##  subject  (Intercept)  8.8743  2.9790  
##  Residual             29.3675  5.4192

[\(\beta_{00} - 1.96 \times \sqrt{\sigma_{r_{0i}}^2}\), \(\beta_{00} + 1.96 \times \sqrt{\sigma_{r_{0i}}^2}\)]

95% of individual intercepts in the control group are in:

\[[101.263 - 1.96 \times \sqrt{8.874}, 101.263 + 1.96 \times \sqrt{8.874}]\]

\[[95.424, 107.102]\]

95% of individual intercepts in the treatment group are in:

\[[100.029 - 1.96 \times \sqrt{8.874}, 100.029 + 1.96 \times \sqrt{8.874}]\]

\[[94.19, 105.867]\]
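Both prediction intervals use the same half-width (1.96 times the random intercept SD) around each group's intercept; a check of the arithmetic (Python sketch):

```python
import math

var_int = 8.874  # random intercept variance from model2
half_width = 1.96 * math.sqrt(var_int)

for label, intercept in (("control", 101.263), ("treatment", 100.029)):
    print(label, round(intercept - half_width, 3), round(intercept + half_width, 3))
# control 95.424 107.102
# treatment 94.19 105.868 (slide shows 105.867 from unrounded estimates)
```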

3.2.13 Spaghetti plot with fixed and random effects

Spaghetti plot with fixed effects and random intercepts by group